125 research outputs found
Income Thresholds and Income Classes
This paper proposes a method for detecting income classes based on the change-point problem. There is an increasing demand for such a method in the literature. Computation of polarization indices requires a pre-grouping of the incomes. Similarly, indices of social exclusion and sometimes indices of income inequality require detection of thresholds. The estimation procedure is implemented using a bootstrap technique. Finally, an application of the method to EU member states and to the United States is also considered.
Keywords: income distribution, change-point, thresholds.
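As a rough illustration of the idea of locating an income threshold as a change point and assessing it by bootstrap, the following minimal sketch may help. It is not the paper's estimator: it assumes a single threshold, uses a simple least-squares split of the sorted incomes, and the function names (`best_split`, `threshold`, `bootstrap_thresholds`) and the income values are hypothetical.

```python
import random


def best_split(sorted_x):
    """Index k (1 <= k < n) minimising the within-group sum of squares
    when sorted_x is split into sorted_x[:k] and sorted_x[k:]."""
    n = len(sorted_x)
    total = sum(sorted_x)
    total_sq = sum(v * v for v in sorted_x)
    best_k, best_cost = None, float("inf")
    left = left_sq = 0.0
    for k in range(1, n):
        left += sorted_x[k - 1]
        left_sq += sorted_x[k - 1] ** 2
        right = total - left
        right_sq = total_sq - left_sq
        # Sum of squared deviations from the mean, for each group.
        cost = (left_sq - left * left / k) + (right_sq - right * right / (n - k))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


def threshold(x):
    """Midpoint between the two incomes adjacent to the best split."""
    s = sorted(x)
    k = best_split(s)
    return 0.5 * (s[k - 1] + s[k])


def bootstrap_thresholds(x, n_boot=200, seed=0):
    """Re-estimate the threshold on resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(x)
    return [threshold([rng.choice(x) for _ in range(n)]) for _ in range(n_boot)]


# Hypothetical incomes forming two well-separated groups.
incomes = [10.0, 11.0, 12.0, 13.0, 50.0, 52.0, 54.0, 56.0]
estimate = threshold(incomes)
boot = bootstrap_thresholds(incomes)
```

The spread of the values in `boot` gives a crude picture of how stable the estimated threshold is under resampling.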
Functional Data Representation with Merge Trees
In this paper we address the problem of representing functional data with
the tools of algebraic topology. We represent functions by means of merge trees
and this representation is compared with that offered by persistence diagrams.
We show that these two tree structures, although not equivalent, are both
invariant under homeomorphic re-parametrizations of the functions they
represent, thus allowing for a statistical analysis which is indifferent to
functional misalignment. We employ a novel metric for merge trees and we prove
a few theoretical results related to its specific implementation when merge
trees represent functions. To showcase the good properties of our topological
approach to functional data analysis, we first go through a few examples using
data generated in silico and employed to illustrate and compare the
different representations provided by merge trees and persistence diagrams, and
then we test it on the Aneurisk65 dataset, replicating, from our different
perspective, the supervised classification analysis which contributed to making
this dataset a benchmark for methods dealing with misaligned functional data.
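The invariance under re-parametrization mentioned above can be checked numerically on a sampled function. The sketch below is only an illustration under stated assumptions: it computes the 0-dimensional sublevel-set persistence pairs of a function sampled on a line (a persistence diagram, not the merge tree metric of the paper), and it mimics a monotone re-parametrization by repeating samples. The function name `persistence_pairs` and the sample values are hypothetical.

```python
def persistence_pairs(values):
    """0-dimensional sublevel-set persistence of a function sampled on a line.

    Returns sorted (birth, death) pairs with positive persistence, plus the
    essential pair (global minimum, inf)."""
    n = len(values)
    parent = [None] * n          # None marks samples not yet in the filtration
    birth = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    pairs = []
    # Add samples from the lowest value to the highest.
    for i in sorted(range(n), key=lambda idx: values[idx]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Elder rule: the component born later dies at this merge.
                    young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                    if values[i] > birth[young]:
                        pairs.append((birth[young], values[i]))
                    parent[young] = old
    pairs.append((min(values), float("inf")))
    return sorted(pairs)


# Hypothetical sampled function and a "warped" copy in which each sample is
# repeated a different number of times (a discrete monotone re-parametrization).
f = [0.0, 2.0, 1.0, 3.0, 0.5, 4.0]
repeats = [1, 3, 2, 1, 4, 2]
f_warped = [v for v, k in zip(f, repeats) for _ in range(k)]

pairs_f = persistence_pairs(f)
pairs_warped = persistence_pairs(f_warped)
```

Both calls return the same pairs, since only the order and heights of the critical values matter, not where along the domain they occur.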
Hierarchical independent component analysis: A multi-resolution non-orthogonal data-driven basis
A new method named Hierarchical Independent Component Analysis is presented, particularly
suited for dealing with two problems regarding the analysis of high-dimensional
and complex data: dimensionality reduction and multi-resolution analysis. It is set within
the Blind Source Separation framework, where the aim is to find a basis
for a reduced-dimensional space to represent the data, whose basis elements represent physical
features of the phenomenon under study. In this case an orthogonal basis may not be
suitable, since orthogonality introduces an artificial constraint not related to the phenomenological
properties of the analyzed problem. For this reason the new approach is
introduced. It is obtained by integrating Treelets and Independent Component
Analysis, and it is able to provide a multi-scale non-orthogonal data-driven basis.
Furthermore, a strategy to perform dimensionality reduction with a non-orthogonal basis is
presented and the theoretical properties of Hierarchical Independent Component Analysis
are analyzed. Finally, the HICA algorithm is tested both on synthetic data and on a real dataset
of electroencephalographic traces.
On the role of statistics in the era of big data: A call for a debate
While discussing the plenary talk of Dunson (2016) at the 48th Scientific Meeting of the Italian Statistical Society, I formulated a few general questions on the role of statistics in the era of big data which stimulated an interesting debate. They are reported here with the aim of engaging a larger audience on an issue which promises to radically change our discipline and, more generally, science as we know it. But is it so?
Object Oriented Geostatistical Simulation of Functional Compositions via Dimensionality Reduction in Bayes spaces
We address the problem of geostatistical simulation of complex spatial
data, with emphasis on functional compositions (FCs). We pursue an object-oriented
geostatistical approach and interpret FCs as random points in a Bayes Hilbert
space. This enables us to deal with data dimensionality and constraints by relying
on a solid geometric basis, and to develop a simulation strategy consisting of: (i) optimal
dimensionality reduction of the problem through a simplicial principal component
analysis, and (ii) geostatistical simulation of random realizations of FCs via
an approximate multivariate problem. We illustrate our methodology on a dataset of
natural soil particle-size densities collected in an alluvial aquifer.
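Working with compositions in a Bayes space is commonly made practical through the centred log-ratio (clr) transform, after which ordinary linear tools such as principal component analysis can be applied. The following is a minimal sketch for a density discretized on a few bins, not the implementation used in the paper; the function names and the particle-size values are hypothetical.

```python
import math


def clr(density):
    """Centred log-ratio transform of a strictly positive discretized density."""
    logs = [math.log(v) for v in density]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]


def clr_inverse(z):
    """Map clr coordinates back to a density that sums to one."""
    w = [math.exp(v) for v in z]
    total = sum(w)
    return [v / total for v in w]


# Hypothetical particle-size density on four bins (sums to one).
psd = [0.1, 0.2, 0.4, 0.3]
z = clr(psd)                            # coordinates sum to zero
back = clr_inverse(z)                   # recovers the original density
rescaled = clr([10 * v for v in psd])   # rescaling leaves clr unchanged
```

The scale invariance shown by `rescaled` reflects the fact that a composition carries only relative information, which is the constraint the Bayes space geometry is designed to respect.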
A Class-Kriging predictor for Functional Compositions with Application to Particle-Size Curves in Heterogeneous Aquifers
This work addresses the problem of characterizing the spatial field of soil
particle-size distributions within a heterogeneous aquifer system. The medium is conceptualized
as a composite system, characterized by spatially varying soil textural
properties associated with diverse geomaterials. The heterogeneity of the system is
modeled through an original hierarchical model for particle-size distributions that are
here interpreted as points in the Bayes space of functional compositions. This theoretical
framework enables spatial prediction of functional compositions
through a functional compositional Class-Kriging predictor. To tackle the problem of
lack of information arising when the spatial arrangement of soil types is unobserved,
a novel clustering method is proposed, making it possible to infer a grouping structure from
sampled particle-size distributions. The proposed methodology enables one to project
the complete information content embedded in the set of heterogeneous particle-size
distributions to unsampled locations in the system. These developments are tested on
a field application relying on a set of particle-size data observed within an alluvial
aquifer in the Neckar river valley, in Germany.